
Split ev_op/init #526

Draft
AkshatRai07 wants to merge 11 commits into master from refactor/split-quad-ker

Conversation

@AkshatRai07
Collaborator

@AkshatRai07 AkshatRai07 commented Jun 3, 2026

a step for #518

  • split src/eko/evolution_operator/__init__.py into two parts: the new part (quad_ker.py) containing the integration kernels only and the remaining part containing only the class
  • this leaves the Python workflow untouched, but pushes the newly proposed Rust workflow
  • this will (hopefully, to be confirmed) only make it necessary for the Rust-activation-hack to patch this new part
  • the new strategy will address:
    • make Rust independent from Python/Numba (only the reverse dependency remains)
    • in particular: no more "passing numba function pointer around"
    • Rust will only get the variables it (currently) needs (once Rust controls the full integration kernel in the future it will of course receive again everything)

(EDIT: this would have been better placed in a comment, since it is not describing the PR but rather one specific problem)

@felixhekhorn @scarlehoff I have one small issue with this and hence I'd like to know whether you are facing it or not. After changing all the files I tried to run poe test and got this error:

Poe => pytest tests
ImportError while loading conftest '/home/akshatrai/Documents/GSoC/eko/tests/conftest.py'.
tests/conftest.py:11: in <module>
    from eko import interpolation
src/eko/__init__.py:6: in <module>
    from .runner import solve
src/eko/runner/__init__.py:4: in <module>
    from .managed import solve
src/eko/runner/managed.py:19: in <module>
    from . import operators, parts, recipes
src/eko/runner/parts.py:15: in <module>
    from .. import evolution_operator as evop
src/eko/evolution_operator/__init__.py:25: in <module>
    from .quad_ker import quad_ker
src/eko/evolution_operator/quad_ker.py:631: in <module>
    @nb.cfunc(
.venv/lib/python3.11/site-packages/numba/core/decorators.py:275: in wrapper
    res.compile()
.venv/lib/python3.11/site-packages/numba/core/compiler_lock.py:35: in _acquire_compile_lock
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/ccallback.py:68: in compile
    cres = self._compile_uncached()
           ^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/ccallback.py:82: in _compile_uncached
    return self._compiler.compile(sig.args, sig.return_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/dispatcher.py:84: in compile
    raise retval
.venv/lib/python3.11/site-packages/numba/core/dispatcher.py:94: in _compile_cached
    retval = self._compile_core(args, return_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/dispatcher.py:107: in _compile_core
    cres = compiler.compile_extra(self.targetdescr.typing_context,
.venv/lib/python3.11/site-packages/numba/core/compiler.py:739: in compile_extra
    return pipeline.compile_extra(func)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/compiler.py:439: in compile_extra
    return self._compile_bytecode()
           ^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/compiler.py:505: in _compile_bytecode
    return self._compile_core()
           ^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/compiler.py:484: in _compile_core
    raise e
.venv/lib/python3.11/site-packages/numba/core/compiler.py:473: in _compile_core
    pm.run(self.state)
.venv/lib/python3.11/site-packages/numba/core/compiler_machinery.py:367: in run
    raise patched_exception
.venv/lib/python3.11/site-packages/numba/core/compiler_machinery.py:356: in run
    self._runPass(idx, pass_inst, state)
.venv/lib/python3.11/site-packages/numba/core/compiler_lock.py:35: in _acquire_compile_lock
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/compiler_machinery.py:311: in _runPass
    mutated |= check(pss.run_pass, internal_state)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/compiler_machinery.py:272: in check
    mangled = func(compiler_state)
              ^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/typed_passes.py:112: in run_pass
    typemap, return_type, calltypes, errs = type_inference_stage(
.venv/lib/python3.11/site-packages/numba/core/typed_passes.py:91: in type_inference_stage
    infer.build_constraint()
.venv/lib/python3.11/site-packages/numba/core/typeinfer.py:1027: in build_constraint
    self.constrain_statement(inst)
.venv/lib/python3.11/site-packages/numba/core/typeinfer.py:1389: in constrain_statement
    self.typeof_assign(inst)
.venv/lib/python3.11/site-packages/numba/core/typeinfer.py:1464: in typeof_assign
    self.typeof_global(inst, inst.target, value)
.venv/lib/python3.11/site-packages/numba/core/typeinfer.py:1564: in typeof_global
    typ = self.resolve_value_type(inst, gvar.value)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/numba/core/typeinfer.py:1485: in resolve_value_type
    raise TypingError(msg, loc=inst.loc)
E   numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
E   Untyped global name 'select_singlet_element': Cannot determine Numba type of <class 'function'>
E
E   File "src/eko/evolution_operator/quad_ker.py", line 705:
E   def cb_quad_ker_qcd(
E       <source elided>
E               ) @ np.ascontiguousarray(ker)
E           ker = select_singlet_element(ker, mode0, mode1)
E           ^
E
E   During: Pass nopython_type_inference

In short, the function select_singlet_element is of type function, and not whatever Numba wants it to be (i.e. an njit function). I have no idea why this is happening, since the function is defined above in the same file, and the structure of the file was the same before and did not trigger this error. I ran the following and the error went away:

find . -type d -name "__pycache__" -exec rm -rf {} +
.venv/bin/python -c "import sys; sys.path.insert(0, 'src'); import eko.evolution_operator.quad_ker"

In other words: remove all caches, then import only the quad_ker module so that it is compiled and cached cleanly. After that, poe test runs without errors.

I had faced this error before. After running rustify.sh, poe test was returning the same error. I imported all the dependencies of quad_ker.py in terminal (similar way as shown above) and the error was resolved.

I have no clue why on Earth numba is showing this behaviour, but we should not split the file if the error persists.

Copilot AI review requested due to automatic review settings June 3, 2026 13:42

Copilot AI left a comment

Pull request overview

This PR refactors the eko.evolution_operator package by extracting the Numba-based quadrature kernel implementation (quad_ker, QuadKerBase, and related helpers) out of evolution_operator/__init__.py into a dedicated quad_ker.py module, and updates internal imports accordingly.

Changes:

  • Move the quadrature-kernel logic (including QuadKerBase and QCD/QED kernel builders) from evolution_operator/__init__.py into evolution_operator/quad_ker.py.
  • Slim down evolution_operator/__init__.py by removing large Numba-heavy definitions and importing quad_ker from the new module.
  • Update operator_matrix_element.py to import QuadKerBase from quad_ker.py instead of from the package __init__.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/eko/evolution_operator/quad_ker.py | Hosts the extracted quadrature-kernel code, adds/relocates helper selectors and QuadKerBase, retains Numba cfunc callbacks. |
| src/eko/evolution_operator/operator_matrix_element.py | Adjusts imports to reference QuadKerBase from the new quad_ker module. |
| src/eko/evolution_operator/__init__.py | Removes the inlined quad-kernel implementation and imports quad_ker from quad_ker.py. |


Comment thread src/eko/evolution_operator/quad_ker.py
Comment thread src/eko/evolution_operator/quad_ker.py
Comment thread src/eko/evolution_operator/quad_ker.py
Comment thread src/eko/evolution_operator/quad_ker.py Outdated
@AkshatRai07
Collaborator Author

AkshatRai07 commented Jun 3, 2026

So apparently all of this is caused by the three cb_quad_ker_qcd/qed/ome functions (which are cfuncs for the Rust callback), since they are not disabled by NUMBA_DISABLE_JIT=1. No clue why this was not happening before on master but is happening now. This will be solved after we fully convert to Rust.

Unfortunately there is no dedicated NUMBA_DISABLE_CFUNC. One thing which we can do is replace cfunc at the time of testing by adding this at the top of the file:

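import os

import numba as nb

# When the JIT is disabled for testing, turn nb.cfunc into a no-op as well
if os.environ.get("NUMBA_DISABLE_JIT", "0") == "1":

    def cfunc_wrapper(sig, *args, **kwargs):
        # Accept and ignore the signature; return the function unchanged
        def decorator(func):
            return func

        return decorator

    nb.cfunc = cfunc_wrapper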

Please do let me know if you are satisfied with this or not so that I can start fixing patch files for LHA Benchmarks (Rust) to run successfully.
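The interaction described above (the JIT decorators become no-ops while the cfunc callbacks are still compiled) can be mimicked with a small toy model. This is only a sketch: `toy_njit`, `toy_cfunc`, `Compiled` and `build` are hypothetical stand-ins written for illustration, not Numba's API, and the error text merely imitates the real message.

```python
# Toy model only: these decorators are hypothetical stand-ins, not Numba's API.


class Compiled:
    """Marks a function that the toy compiler can assign a type to."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args):
        return self.func(*args)


def toy_njit(disable_jit):
    """Stand-in for @njit: returns the plain function when the JIT is disabled."""

    def decorator(func):
        return func if disable_jit else Compiled(func)

    return decorator


def toy_cfunc(*callees):
    """Stand-in for @cfunc: always compiles, so every callee must be typed."""

    def decorator(func):
        for callee in callees:
            if not isinstance(callee, Compiled):
                raise TypeError(
                    f"Untyped global name '{callee.__name__}': {type(callee)}"
                )
        return Compiled(func)

    return decorator


def build(disable_jit):
    """Define a helper and a callback that uses it, as in quad_ker.py."""

    @toy_njit(disable_jit)
    def select_element(ker):
        return ker[0]

    @toy_cfunc(select_element)
    def callback(ker):
        return select_element(ker)

    return callback


print(build(disable_jit=False)([1.0, 2.0]))  # helper is compiled: callback works
try:
    build(disable_jit=True)  # helper stays a plain function: compilation fails
except TypeError as err:
    print("JIT disabled:", err)
```

In this model the failure appears as soon as the callback is defined, which matches the traceback above being raised at import time rather than at call time.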

@felixhekhorn
Collaborator

> No clue why this was not happening before on master but is happening now. This will be solved after we fully convert to Rust.

Actually, I would say this is a feature and not a bug.

  1. the old layout:
    • recall that @cfunc compiles a C function, which is completely different from a Python function. In particular, while Rust can (easily) execute C functions, it cannot execute Python functions (because they are far more complex due to the language). So when we pass a function to Rust, it must always be a C function, even in testing. Deactivating this feature will break the code.
    • the main reason why we deactivate Numba in testing is for debugging reasons; i.e. with Numba it is impossible to properly debug, so we just deactivate it. However, this should not apply to @cfunc due to the above reason.
  2. the new layout:
    • I suggest to immediately push for the new layout - and there we would not need any @cfunc any longer, since we do not pass Python functions around any longer
    • in order to do so, you can drop all the current content of quad_ker.py - yes, this will break Rust for a moment, but then we fix it in a completely new way: with the new layout

Do you agree? (and to be explicit: I would say Copilot is talking nonsense in this case)
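To make the point about C functions concrete, here is a minimal sketch using only the standard library's ctypes. It shows that a foreign caller works with a raw address plus a C signature, which a plain Python function does not have. Note the assumption: ctypes builds a trampoline that still executes Python, whereas @cfunc compiles native code, so this only illustrates the pointer and ABI aspect. `py_kernel` and `KERNEL_SIG` are hypothetical names.

```python
import ctypes

# C signature: double f(double). Hypothetical example signature.
KERNEL_SIG = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


def py_kernel(x):
    """Plain Python function: it has no C-callable address of its own."""
    return 2.0 * x


# Wrapping produces a real C function pointer (a trampoline back into Python).
c_kernel = KERNEL_SIG(py_kernel)

# This integer address plus the signature is all a foreign caller can work with.
address = ctypes.cast(c_kernel, ctypes.c_void_p).value

# Rebuild a callable from the raw address, as foreign code effectively would.
from_address = KERNEL_SIG(address)
print(from_address(1.5))
```

The wrapper object `c_kernel` has to stay alive for as long as the address is in use, otherwise the pointer dangles.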

Then there is the unrelated matter of Numba+caching: this sometimes yields strange behaviour since the caching is not always consistent. See also numba/numba#8926. Thus, the first action in debugging Numba is always to delete the (Python) cache (files) and let Numba recompile; typically it is sufficient to delete "the relevant cache" (which you need to guess) and not all of it, which for us can take a very long time to recompile.
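Deleting only "the relevant cache" can be scripted. The following is a minimal sketch, assuming the default cache location (Numba writes `*.nbi` index and `*.nbc` data files into the `__pycache__` folder next to the source unless a different cache directory is configured). The helper name `clear_numba_cache` is hypothetical.

```python
from pathlib import Path


def clear_numba_cache(package_dir):
    """Delete Numba cache files below ``package_dir`` and return the removed paths.

    Only ``*.nbi`` (index) and ``*.nbc`` (data) files are touched, so regular
    ``*.pyc`` bytecode files stay in place.
    """
    removed = []
    for pattern in ("*.nbi", "*.nbc"):
        # Materialise the matches first so that deleting does not disturb the walk.
        for path in list(Path(package_dir).rglob(pattern)):
            path.unlink()
            removed.append(path)
    return removed
```

For the case in this thread one would call, for example, `clear_numba_cache("src/eko/evolution_operator")`, so that only that subpackage is recompiled on the next import.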

@felixhekhorn felixhekhorn marked this pull request as draft June 4, 2026 08:11
@AkshatRai07
Collaborator Author

In summary, today I removed the cfuncs and changed the eko crate binding from cffi to PyO3 (which led to changes in the root Cargo.lock). It took me some time to figure out that it is pretty hard to call a cffi function from an nb.njit function. Also, I knew from the GSoC task I did that njit cannot compile a PyO3 object. So the way out of this (for now) is to call such functions from inside objmode. In this setting, Numba drops back into the Python interpreter, executes whatever is inside the block, and then returns the result to the nopython code. As we move more and more of the library to Rust, we will entirely drop Numba, and hence this workaround.

I also changed poe compile to use maturin, added that as a dev-dependency. This led to changes in poetry.lock and lha_bot_rust.yaml.

Comment thread crates/eko/src/lib.rs Outdated
Comment thread crates/eko/src/lib.rs Outdated
Comment thread crates/eko/src/lib.rs Outdated
Comment thread crates/eko/src/lib.rs
/// - [`qcd_gamma_singlet`]: QCD singlet anomalous dimension matrix in Mellin space
#[pymodule]
fn ekors(_py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(qcd_gamma_singlet, m)?)?;
Collaborator

If this works, please also add the non-singlet counterpart

Comment thread crates/eko/Cargo.toml
[dependencies]
num = "0.4.1"
ekore = { path = "../ekore", version = "0.0.1" }
pyo3 = { version = "0.21", features = ["extension-module"] }
Collaborator

Can you please briefly explain to me what the extension-module does?

Collaborator Author

@AkshatRai07 AkshatRai07 Jun 5, 2026

So when I first did poe compile with maturin, it gave me this warning:

⚠️  Warning: You're building a library without activating pyo3's `extension-module` feature. See https://pyo3.rs/v0.21.2/building-and-distribution.html#the-extension-module-feature

Without extension-module, a pyo3 package links its own libpython into the .so. But when Python imports our package, there's already a libpython running, the Python interpreter itself. Having two copies of libpython in the same process causes conflicts. extension-module disables this. It tells the package to just use the libpython that's already running instead of bundling its own.

The downside is that cargo test will break for any pyo3 functions, since the test binary runs standalone and has no libpython to rely on. This shouldn't be an issue for now as we don't plan to write tests for these.

Also, as per the docs linked in the warning:

Python extensions on Unix must not link to libpython for manylinux compliance.

So extension-module is a necessity here.

Do tell me if you are satisfied by reacting with an emoji so that I can resolve the conversation :)

Comment thread src/eko/evolution_operator/operator_matrix_element.py Outdated
Collaborator

of course we need to counteract this deletion (which is the right thing), with the corresponding new patch

Comment thread crates/eko/src/lib.rs
+ order_qcd = order[0]
+ n_re = ker_base.n.real
+ n_im = ker_base.n.imag
+ with nb.objmode(gamma_singlet="complex128[:,:,:]"):
Collaborator

I guess the problem is not so much calling a foreign function in numba, but rather a PyO3 object ...

For the context of #136 (which will introduce another C library into the mess of anomalous dimensions), I was experimenting with how to do that, and for plain C functions it seems to be straightforward:

from numba import njit
import ctypes

# Load the shared library built from the C source below
DSO = ctypes.CDLL('./shared_library.so')

# Add typing information (must match the int64_t signature of the C function)
c_func = DSO.sum
c_func.restype = ctypes.c_int64
c_func.argtypes = [ctypes.c_int64, ctypes.c_int64]

@njit
def example(x, y):
    return c_func(x, y)

print(example(3, 4)) # 7

// gcc lib.c -fPIC -shared -o shared_library.so
#include <stdint.h>

int64_t sum(int64_t a, int64_t b){
    return a + b;
}

here we don't have a direct C library in hand, so it does not really help us, right?

well, actually it shows another alternative to PyO3: we could also add a C interface to crates/eko (remember that the target of #519 is crates/ekore) and use that under scipy ... in the end we don't care so much about crates/eko being a Python library since it is purely internal (it is just glue) ....

any thoughts/opinions?

what is the easiest way? which we can grow smoothly? if we ever would want to change strategy, we most likely want to do so in a next step, provided objmode works ...

Collaborator Author

The issue with using ctypes was passing tuples, lists, and arrays: there was some sort of type mismatch between Python and Rust, and in the end I just gave up. I thought to myself that all of this is just temporary, so let us make a working solution, because in the future we will remove the #[pyfunction] from qcd_quad_ker and add that macro to the quad_ker function. Ultimately we have to remove this #[pyfunction] entirely and directly pass a single C-ABI function to scipy.integrate.quad with LowLevelCallable.
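For reference, the kind of explicit conversion that ctypes needs for tuples and lists can be sketched with the standard library alone. The callback below uses the C signature `double f(int n, double *xx)`, which is one of the signatures scipy.integrate.quad accepts through LowLevelCallable. Everything else is an assumption made for illustration: `toy_kernel`, `pack_doubles`, and the values of `order` and `extra` are hypothetical and do not reflect the actual eko kernel arguments.

```python
import ctypes

# C signature: double f(int n, double *xx)
QUAD_SIG = ctypes.CFUNCTYPE(
    ctypes.c_double, ctypes.c_int, ctypes.POINTER(ctypes.c_double)
)


@QUAD_SIG
def toy_kernel(n, xx):
    """Hypothetical integrand: xx[0] is the variable, xx[1:n] are parameters."""
    x = xx[0]
    return x * sum(xx[i] for i in range(1, n))


def pack_doubles(values):
    """Convert a Python tuple or list into a contiguous C array of doubles."""
    return (ctypes.c_double * len(values))(*values)


order = (1, 0)  # hypothetical perturbative order tuple
extra = [0.5, 2.0]  # hypothetical extra parameters
args = pack_doubles([0.25, *order, *extra])
print(toy_kernel(len(args), args))
```

Flattening every argument into one array of doubles sidesteps per-type mismatches, at the cost of having to document the layout of that array on both sides.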

For #519, I thought you wanted that crate to expose the ekore crate to other C/C++ projects, not the Python one. The quad function requires almost half of the src/eko Python code to be part of its hot loop. We could try to add that to eko_capi, but I don't think we should change the whole structure of the crate for just one function. Otherwise, we would have to call that one function from the eko crate and expose it as C-ABI in eko_capi, which would bloat the eko_capi crate, as the eko crate would then be required for compilation. After that, we would use the .so file for just one function while it exposes other functions too, which again feels bloated to me.

It's better to stick to what we are currently doing.

  run: pip install poethepoet
- name: Run benchmark
  env:
    VIRTUAL_ENV: ${{ env.pythonLocation }}
Collaborator

why is that needed?

Collaborator Author

@AkshatRai07 AkshatRai07 Jun 5, 2026

poe compile did not work here because maturin was not able to identify the virtual environment. This was the root cause reported by the error:

For maturin to find your virtualenv you need to either set VIRTUAL_ENV or have a virtualenv called .venv

GitHub Actions' setup-python installs Python at pythonLocation (i.e. /opt/hostedtoolcache/Python/3.12.13/x64) but does not set up a virtual environment. So I pointed VIRTUAL_ENV to pythonLocation, which is already set by setup-python, so maturin can find the correct Python to build against.
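The rule quoted in the error message (use VIRTUAL_ENV if set, otherwise a folder called .venv) can be sketched as follows. This is an illustration of the message text only, not maturin's actual lookup code, and the function name `find_virtualenv` is hypothetical.

```python
import os
from pathlib import Path


def find_virtualenv(environ=None, cwd="."):
    """Return the environment directory per the rule quoted in the error message.

    Illustrative only: this mirrors the message text, not maturin's actual code.
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("VIRTUAL_ENV")
    if explicit:
        # An explicitly exported VIRTUAL_ENV takes precedence.
        return Path(explicit)
    local = Path(cwd) / ".venv"
    return local if local.is_dir() else None
```

On a runner without either of the two, the function returns `None`, which corresponds to the failure seen in CI before VIRTUAL_ENV was pointed at pythonLocation.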

Do tell me if you are satisfied by reacting with an emoji so that I can resolve the conversation :)



3 participants